-
Notifications
You must be signed in to change notification settings - Fork 664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hbond analysis: fixed incorrect residue handling with trailing numbers #1339
Conversation
Will fix the doc build error https://travis-ci.org/MDAnalysis/mdanalysis/jobs/230575555 EDIT: no idea why napoleon barffs on a lone Note heading; it also does not like a lone See Also – maybe because it is a property annotation?? Anyway, need to change back to reST. |
Only the numpy dev test fails. I'll rebase the whole thing later and then ask for reviews. |
Could someone please review? The travis failure is due to the stuff discussed in #1334 and is harmless. |
|
||
.. deprecated:: 0.15.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you somewhere add a developer note how to remove these things. I must have missed it looking for deprecations to remove before the last release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, added the following
# REMOVAL (DEPRECATION) instructions for 0.17.0: "1-based indices" (aka "idx" vs 0-based "index")
# - remove the warning
# - replace 'deprecated 0.15.0' with 'versionchanged 0.17.0' (and adjust text)
# - update docs for "timeseries" by removing all "idx (1-based)" entries:
# one hydrogen bond should now look like
# <donor index (0-based)>, <acceptor index (0-based)>, <donor string>, <acceptor string>,
# <distance>, <angle>
# - update docs for "table": removed *_idx and renumber
# - in run(): removed the 1-based indices (h.index + 1, a.index + 1) from frame_results:
# for instance, should now read (two occurences!)
# frame_results.append(
# [h.index, a.index,
# (h.resname, h.resid, h.name),
# (a.resname, a.resid, a.name),
# dist, angle])
# - generate_table(), count_*(), timesteps_by_type():
# remove any donor_idx, acceptor_idx variables and fields.
# - update tests (change indices in analysis/test_hbonds.py)
# - update any docs in this file that mention "1-based"
#
# I suggest to keep this removal a separate commit. [@orbeckst]
*selection2* handles donors and acceptors: If *selection1* contains | ||
'both' then *selection2* will also contain *both*. If *selection1* | ||
is set to 'donor' then *selection2* is 'acceptor' (and vice versa). | ||
value for `selection1_type` automatically determines how |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need double backticks. The single ones will result in a itallic formatting. We have so far always opted for a monospace-coding formatting with double backticks for variable names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't changed this yet because this is how numpy docs are written (IIRC) and it is consistent with the Napoleon formatting.
and they are targeted for removal in 0.17.0. | ||
|
||
""" | ||
return [[self._reformat_hb(hb) for hb in hframe] for hframe in self._timeseries] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is here the fix for the problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The real fix is
frame_results.append(
[h.index + 1, a.index + 1, h.index, a.index,
(h.resname, h.resid, h.name),
(a.resname, a.resid, a.name),
dist, angle])
where we store atom information as a tuple (h.resname, h.resid, h.name)
and (a.resname, a.resid, a.name)
instead of a mangled name. These unambiguous data are then stored as _timeseries
. The timeseries
is generated to look like the previous output. Importantly, the other methods (generate_table
, count_*
, timeseries_by_type
) now all use the unambiguous data to generate their own data structures. Using unambiguous time series data fixes the issue.
The _reformat_hb()
is just a way to return the same timeseries
as in previous versions.
Do you think that this needs to be commented?
(In principle we could just store atom indices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to comment this. But a hint about this for a review with lots of docs changes would be nice next time.
@@ -1083,47 +1228,52 @@ def save_table(self, filename="hbond_table.pickle"): | |||
self.generate_table() | |||
cPickle.dump(self.table, open(filename, 'wb'), protocol=cPickle.HIGHEST_PROTOCOL) | |||
|
|||
def _has_timeseries(self): | |||
has_timeseries = (self._timeseries is not None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no braces needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like them for readability but I am happy to remove them – if you didn't care you wouldn't have taken the time to comment.
…drogenBondAnalysis (issue #801) - fixes #801 - analysis.hbond.hbond_analysis.HydrogenBondAnalysis now correctly stores and parses donor and acceptor names and is not tripped up by resnames that end in numbers, such as TIP3 - store timeseries as _timeseries with donor and acceptor atom identifiers as tuples instead of strings (avoids the use of fragile lib.util.parse_residue()) - made timeseries a property that is generated on the fly so that old code does not break, but it is not cached (to avoid memory consumption for big trajectories) and so users should cache themselves if needed - added tests - rewrote parts of the docs and added notes on use of timeseries - updated DEPRECATION warning for 1-based indices to 0.17.0 (should have been removed in 0.16.0 but we forgot)
- mostly numpy style (whenever possible): Apparently, napoleon does not like a single Notes and See Also section, need to use reST. - named the 1-based indices "idx" in the docs. - added example for analysis - describe convenience analysis functions - how to use pandas
Use atom.index instead of atom.index+1 internally and for debug output.
Feel free to squash. |
Fixes #801
Changes made in this Pull Request:
table
_timeseries
timeseries
that reproduces the previous behavior (but might come at speed penalty because it is not cached on purpose)PR Checklist